In [1]:
import pandas as pd
import numpy as np

## Worklist prioritization: Emergency Setting

In [2]:
## First, read in the file of the current worklist with the probabilities that your two algorithms have
## generated for the two types of findings you're most concerned with:

worklist = pd.read_csv('probabilities.csv')

In [3]:
worklist.head()

Unnamed: 0,Image_Type,Brain_bleed_probability,Aortic_dissection_probability
0,chest_xray,0.0,0.05
1,chest_xray,0.0,0.17
2,chest_xray,0.0,0.0
3,chest_xray,0.0,0.04
4,wrist_xray,0.0,0.0


To address the first question in the exercise, I'll add a column giving the time (in minutes) at which each image has been read, assuming every image takes 6 minutes to read and the images are read in the order they appear in this list.

In [4]:
worklist['time_to_read'] = np.arange(6, 6*(len(worklist)+1),6)

In [5]:
worklist.head()

Unnamed: 0,Image_Type,Brain_bleed_probability,Aortic_dissection_probability,time_to_read
0,chest_xray,0.0,0.05,6
1,chest_xray,0.0,0.17,12
2,chest_xray,0.0,0.0,18
3,chest_xray,0.0,0.04,24
4,wrist_xray,0.0,0.0,30
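
As a quick sanity check on the arithmetic, here is a minimal sketch on a made-up three-row worklist. The `toy` DataFrame and the `MINUTES_PER_READ` constant are my own illustrative names, not part of the exercise files; the 6-minute figure is the one from the exercise.

```python
import numpy as np
import pandas as pd

MINUTES_PER_READ = 6  # assumption: every image takes 6 minutes, as in the exercise

# Toy worklist standing in for probabilities.csv (values are made up).
toy = pd.DataFrame({"Image_Type": ["chest_xray", "head_ct", "wrist_xray"]})

# The image at 0-based position i has been read after (i + 1) * 6 minutes.
toy["time_to_read"] = MINUTES_PER_READ * np.arange(1, len(toy) + 1)
print(toy)
```

This produces the same values as `np.arange(6, 6*(len(worklist)+1), 6)`, just written as "position times minutes per read".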


Now, for each image, I want to see whether a brain bleed or an aortic dissection is likely. I'll create a new column holding the larger of the two probabilities.

In [6]:
worklist['max_prob'] = worklist[["Brain_bleed_probability", "Aortic_dissection_probability"]].max(axis=1)

In [7]:
worklist.head()

Unnamed: 0,Image_Type,Brain_bleed_probability,Aortic_dissection_probability,time_to_read,max_prob
0,chest_xray,0.0,0.05,6,0.05
1,chest_xray,0.0,0.17,12,0.17
2,chest_xray,0.0,0.0,18,0.0
3,chest_xray,0.0,0.04,24,0.04
4,wrist_xray,0.0,0.0,30,0.0


Great, now I want to re-order my worklist based on probabilities of critical findings:

In [8]:
worklist_prioritized = worklist.sort_values(by=['max_prob'],ascending=False)

In [9]:
worklist_prioritized.head()

Unnamed: 0,Image_Type,Brain_bleed_probability,Aortic_dissection_probability,time_to_read,max_prob
25,head_ct,0.99,0.0,156,0.99
15,chest_xray,0.0,0.95,96,0.95
10,chest_xray,0.0,0.94,66,0.94
75,chest_xray,0.0,0.93,456,0.93
47,chest_xray,0.0,0.93,288,0.93


In [10]:
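## Assign new read times: the first row of the prioritized list is read after 6 minutes, the next after 12, and so on
worklist_prioritized['time_to_read_prioritized'] = np.arange(6, 6*(len(worklist_prioritized)+1), 6)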

In [11]:
## Positive time_delta = minutes saved by prioritization; negative = minutes lost
worklist_prioritized['time_delta'] = worklist_prioritized['time_to_read'] - worklist_prioritized['time_to_read_prioritized']

In [12]:
worklist_prioritized.head()

Unnamed: 0,Image_Type,Brain_bleed_probability,Aortic_dissection_probability,time_to_read,max_prob,time_to_read_prioritized,time_delta
25,head_ct,0.99,0.0,156,0.99,6,150
15,chest_xray,0.0,0.95,96,0.95,12,84
10,chest_xray,0.0,0.94,66,0.94,18,48
75,chest_xray,0.0,0.93,456,0.93,24,432
47,chest_xray,0.0,0.93,288,0.93,30,258
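
The steps so far can be bundled into one function, which makes it easier to rerun the analysis on a different worklist. This is only a sketch under my own assumptions: the function name `prioritize`, its parameters, and the toy data are hypothetical, and I've used a stable sort (`kind="mergesort"`) so that images with equal `max_prob` keep their arrival order, which the default sort does not guarantee.

```python
import numpy as np
import pandas as pd

def prioritize(worklist, prob_cols, minutes_per_read=6):
    """Return a copy sorted by highest finding probability, with read times and minutes saved."""
    out = worklist.copy()
    n = len(out)
    # Read time if images are read in arrival order.
    out["time_to_read"] = minutes_per_read * np.arange(1, n + 1)
    # Highest probability across the findings of interest.
    out["max_prob"] = out[prob_cols].max(axis=1)
    # Stable sort: ties on max_prob stay in arrival order.
    out = out.sort_values("max_prob", ascending=False, kind="mergesort")
    # Read time in the new, prioritized order.
    out["time_to_read_prioritized"] = minutes_per_read * np.arange(1, n + 1)
    # Positive = read sooner than before, negative = read later.
    out["time_delta"] = out["time_to_read"] - out["time_to_read_prioritized"]
    return out

# Toy data (made up) with the same column names as the real worklist.
toy = pd.DataFrame({
    "Image_Type": ["chest_xray", "wrist_xray", "head_ct", "chest_xray"],
    "Brain_bleed_probability": [0.0, 0.0, 0.9, 0.0],
    "Aortic_dissection_probability": [0.1, 0.0, 0.0, 0.6],
})
result = prioritize(toy, ["Brain_bleed_probability", "Aortic_dissection_probability"])
print(result[["Image_Type", "max_prob", "time_to_read", "time_to_read_prioritized", "time_delta"]])
```

Because reordering only shuffles the same set of time slots, the `time_delta` column always sums to zero: every minute saved for one image is a minute lost for another.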


Now, I want to find places where my algorithm saved more than 30 minutes on head CTs, the image type where brain bleeds would show up:

In [13]:
worklist_prioritized[((worklist_prioritized.time_delta>30)&(worklist_prioritized.Image_Type=='head_ct'))]

Unnamed: 0,Image_Type,Brain_bleed_probability,Aortic_dissection_probability,time_to_read,max_prob,time_to_read_prioritized,time_delta
25,head_ct,0.99,0.0,156,0.99,6,150
84,head_ct,0.91,0.0,510,0.91,36,474
95,head_ct,0.9,0.0,576,0.9,42,534
42,head_ct,0.89,0.0,258,0.89,48,210
59,head_ct,0.89,0.0,360,0.89,54,306
89,head_ct,0.78,0.0,540,0.78,96,444
39,head_ct,0.77,0.0,240,0.77,102,138
45,head_ct,0.75,0.0,276,0.75,108,168
76,head_ct,0.69,0.0,462,0.69,144,318
96,head_ct,0.45,0.0,582,0.45,198,384


Looks like there are 14 head CTs that were read more than 30 minutes faster than in their original order. Only the last three had a probability of brain bleed < 0.4.

Do the same analysis for saving at least 15 minutes with aortic dissections: 

In [14]:
worklist_prioritized[((worklist_prioritized.time_delta>=15)&(worklist_prioritized.Image_Type=='chest_xray'))]

Unnamed: 0,Image_Type,Brain_bleed_probability,Aortic_dissection_probability,time_to_read,max_prob,time_to_read_prioritized,time_delta
15,chest_xray,0.0,0.95,96,0.95,12,84
10,chest_xray,0.0,0.94,66,0.94,18,48
75,chest_xray,0.0,0.93,456,0.93,24,432
47,chest_xray,0.0,0.93,288,0.93,30,258
48,chest_xray,0.0,0.84,294,0.84,60,234
38,chest_xray,0.0,0.83,234,0.83,66,168
62,chest_xray,0.0,0.82,378,0.82,72,306
87,chest_xray,0.0,0.82,528,0.82,78,450
44,chest_xray,0.0,0.81,270,0.81,84,186
85,chest_xray,0.0,0.79,516,0.79,90,426


In [15]:
len(worklist_prioritized[((worklist_prioritized.time_delta>=15)&(worklist_prioritized.Image_Type=='chest_xray'))])

28

Looks like there are 28 chest x-rays that were read at least 15 minutes faster than in their original order. Only the last nine had a probability of aortic dissection < 0.4.
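
Rather than counting rows by eye, the split between likely and unlikely findings can be computed directly. The helper below is a sketch of one way to do that; `summarize_savings`, its parameters, and the toy values are all my own hypothetical names and numbers, and the 0.4 cutoff is simply the one I used in the commentary.

```python
import pandas as pd

def summarize_savings(df, image_type, prob_col, min_minutes, prob_threshold=0.4):
    """Count images of one type that were sped up, split by finding probability."""
    sped_up = df[(df["time_delta"] >= min_minutes) & (df["Image_Type"] == image_type)]
    likely = int((sped_up[prob_col] >= prob_threshold).sum())
    return {"total": len(sped_up), "likely": likely, "unlikely": len(sped_up) - likely}

# Toy data (made up); the real call would pass worklist_prioritized instead.
toy = pd.DataFrame({
    "Image_Type": ["head_ct", "head_ct", "chest_xray", "head_ct"],
    "Brain_bleed_probability": [0.9, 0.2, 0.0, 0.7],
    "time_delta": [150, 36, 84, 12],
})
summary = summarize_savings(toy, "head_ct", "Brain_bleed_probability", min_minutes=30)
print(summary)
```

On the real data, the same call with `"chest_xray"`, `"Aortic_dissection_probability"`, and `min_minutes=15` would give the chest x-ray breakdown.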

Finally, I'll look for any images with a brain bleed or aortic dissection probability of 0.5 or higher that my algorithm caused to be read _slower_.

In [None]:
worklist_prioritized[((worklist_prioritized.time_delta<0)&(worklist_prioritized.max_prob>=0.5))]

Looks like there were two cases where my algorithm caused an image with a likely critical finding to be read later than it would have been in its original order. Given that images with probabilities <0.5 were read faster, it is definitely possible to improve my algorithm by adding some more heuristics.
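
One heuristic that might help, sketched here purely as an illustration: only promote images whose `max_prob` reaches a threshold, and leave everything else in arrival order, so low-probability images never jump ahead of each other. The function name `prioritize_with_threshold`, the 0.5 default, and the toy values are my assumptions, not part of the exercise. Note that this does not remove every delay: a likely finding can still be pushed back by a more likely one that arrived later.

```python
import numpy as np
import pandas as pd

def prioritize_with_threshold(worklist, threshold=0.5, minutes_per_read=6):
    """Hypothetical heuristic: promote only images whose max_prob >= threshold."""
    out = worklist.copy()
    n = len(out)
    out["arrival_rank"] = np.arange(n)
    out["time_to_read"] = minutes_per_read * (out["arrival_rank"] + 1)
    # Images below the threshold all get sort key 0.0, so they tie
    # and fall back to arrival order via the second sort column.
    out["sort_prob"] = out["max_prob"].where(out["max_prob"] >= threshold, 0.0)
    out = out.sort_values(["sort_prob", "arrival_rank"], ascending=[False, True])
    out["time_to_read_prioritized"] = minutes_per_read * np.arange(1, n + 1)
    out["time_delta"] = out["time_to_read"] - out["time_to_read_prioritized"]
    return out.drop(columns=["arrival_rank", "sort_prob"])

# Toy data (made up): two likely findings arriving third and fourth.
toy = pd.DataFrame({"max_prob": [0.3, 0.0, 0.9, 0.6, 0.1]})
result = prioritize_with_threshold(toy)
print(result)
```

With a plain sort on `max_prob`, the 0.1 image would have jumped ahead of the 0.0 image; here both stay in arrival order behind the two promoted images.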